Clustering and Outlier Detection


A Practical Algorithm for Distributed Clustering and Outlier Detection

Jiecao Chen, Erfan Sadeqi Azer, Qin Zhang

Neural Information Processing Systems

We study classic k-means/median clustering, which are fundamental problems in unsupervised learning, in the setting where data are partitioned across multiple sites, and where we are allowed to discard a small portion of the data by labeling those points as outliers. We propose a simple approach based on constructing a small summary of the original dataset. The proposed method is time- and communication-efficient, has good approximation guarantees, and can identify the global outliers effectively. To the best of our knowledge, this is the first practical algorithm with theoretical guarantees for distributed clustering with outliers. Our experiments on both real and synthetic data demonstrate the clear superiority of our algorithm over all the baseline algorithms in almost all metrics.
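The summary-based idea can be illustrated with a toy sketch. This is not the paper's exact algorithm: the summary construction (plain weighted k-means per site), the deterministic heaviest-point initialization, and the "t globally farthest points" outlier rule are all simplifying assumptions made here for brevity.

```python
import math

def kmeans(points, weights, k, iters=25):
    """Weighted Lloyd's k-means, deterministically seeded with the k
    heaviest points (a simplification; real code would use k-means++)."""
    order = sorted(range(len(points)), key=lambda i: -weights[i])
    centers = [points[i] for i in order[:k]]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p, w in zip(points, weights):
            j = min(range(k), key=lambda i: math.dist(p, centers[i]))
            groups[j].append((p, w))
        for j, g in enumerate(groups):
            tot = sum(w for _, w in g)
            if tot > 0:  # leave centers of empty/zero-weight groups untouched
                dim = len(centers[j])
                centers[j] = tuple(sum(p[d] * w for p, w in g) / tot
                                   for d in range(dim))
    return centers

def site_summary(points, size):
    """A site compresses its local data into `size` weighted centers."""
    centers = kmeans(points, [1.0] * len(points), size)
    weights = [0.0] * size
    for p in points:
        weights[min(range(size), key=lambda i: math.dist(p, centers[i]))] += 1.0
    return centers, weights

def distributed_cluster_with_outliers(sites, k, t, summary_size=8):
    """Coordinator: cluster the union of the site summaries, then report
    the t original points farthest from the final centers as outliers."""
    all_centers, all_weights = [], []
    for pts in sites:
        c, w = site_summary(pts, summary_size)
        all_centers += c
        all_weights += w
    centers = kmeans(all_centers, all_weights, k)
    everything = [p for pts in sites for p in pts]
    ranked = sorted(everything,
                    key=lambda p: min(math.dist(p, c) for c in centers))
    return centers, ranked[len(everything) - t:]
```

Only the small summaries travel to the coordinator, which is the source of the communication savings; the paper's actual summary comes with approximation guarantees that this naive version does not have.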


On Integrated Clustering and Outlier Detection

Lionel Ott, Linsey Pang, Fabio T. Ramos, Sanjay Chawla

Neural Information Processing Systems

We model the joint clustering and outlier detection problem using an extension of the facility location formulation. The advantages of combining clustering and outlier selection include: (i) the resulting clusters tend to be compact and semantically coherent; (ii) the clusters are more robust against data perturbations; and (iii) the outliers are contextualised by the clusters and thus more interpretable. We provide a practical subgradient-based algorithm for the problem and also study the theoretical properties of the algorithm in terms of approximation and convergence. Extensive evaluation on synthetic and real data sets attests to both the quality and scalability of our proposed method.
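A facility-location view of joint clustering and outlier detection can be made concrete on a tiny instance. The specifics below (a uniform opening cost `f_cost` and a fixed outlier budget `t`) are assumptions for illustration; the paper's exact objective and its Lagrangian subgradient solver are not reproduced here, and the exhaustive search is only viable for toy inputs.

```python
import math
from itertools import combinations

def cost_with_outliers(points, open_facilities, f_cost, t):
    """Objective of a common facility-location-with-outliers variant:
    opening costs plus assignment distances, with the t worst-served
    points discarded as outliers."""
    d = sorted(min(math.dist(p, f) for f in open_facilities) for p in points)
    return f_cost * len(open_facilities) + sum(d[:len(points) - t])

def best_facilities(points, candidates, f_cost, t):
    """Brute-force search over candidate facility subsets; a stand-in for
    the scalable subgradient method the paper develops."""
    best_cost, best_set = float("inf"), None
    for r in range(1, len(candidates) + 1):
        for subset in combinations(candidates, r):
            c = cost_with_outliers(points, subset, f_cost, t)
            if c < best_cost:
                best_cost, best_set = c, subset
    return best_cost, best_set
```

On two tight clusters plus one distant point, the optimum opens one facility per cluster and spends the outlier budget on the distant point, which is exactly the "outliers are contextualised by the clusters" behaviour described above.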


Reviews: A Practical Algorithm for Distributed Clustering and Outlier Detection

Neural Information Processing Systems

The paper addresses the problem of performing distributed k-means/median clustering in the presence of outliers, while at the same time identifying the outliers. Data are partitioned across multiple sites, either adversarially or randomly, and the sites and a central coordinator jointly compute the clustering and the outliers through communication. The authors proposed a practical algorithm with bounded running time O(\max\{k, \log n\} \cdot n), and bounded communication cost O(s(k \log n + t)) and O(s k \log n + t) for adversarial and random data partitioning respectively, for a dataset with n data points, k centers, and t outliers, partitioned across s sites. They used a traditional two-level clustering framework (Guha et al. 2017). If a \gamma-approximation algorithm for (k,t)-means/median is used as the second-level clustering algorithm, their distributed algorithm has a bounded O(\gamma) approximation factor. Extensive experimental studies were conducted to compare the performance of their algorithm with three baseline algorithms.
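The second level of such a two-level framework needs some (k,t)-means routine that clusters while setting t points aside. A simple heuristic in that family is a trimmed variant of Lloyd's algorithm in the spirit of k-means-- (Chawla and Gionis, 2013); the sketch below is that heuristic with a naive deterministic initialization, not the specific second-level algorithm evaluated in the reviewed paper.

```python
import math

def k_means_with_outliers(points, k, t, iters=30):
    """Trimmed Lloyd iteration: in each step the t points farthest from
    their nearest center are set aside, and the centers are recomputed
    from the remaining inliers only."""
    centers = list(points[:k])  # naive deterministic init for brevity
    for _ in range(iters):
        ranked = sorted(points,
                        key=lambda p: min(math.dist(p, c) for c in centers))
        inliers = ranked[:len(points) - t]
        groups = [[] for _ in range(k)]
        for p in inliers:
            groups[min(range(k),
                       key=lambda i: math.dist(p, centers[i]))].append(p)
        for j, g in enumerate(groups):
            if g:
                dim = len(centers[j])
                centers[j] = tuple(sum(p[d] for p in g) / len(g)
                                   for d in range(dim))
    ranked = sorted(points,
                    key=lambda p: min(math.dist(p, c) for c in centers))
    return centers, ranked[len(points) - t:]
```

Excluding the t farthest points before each center update is what keeps a single extreme point from dragging a center away from its cluster, which is the failure mode of plain Lloyd's algorithm in the presence of outliers.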


